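We consider the product of determinantal point processes (DPPs), a point process whose probability mass is proportional to the product of principal minors of multiple matrices, as a natural and promising generalization of DPPs. We study the computational complexity of computing its normalizing constant, which is among the most essential probabilistic inference tasks. Our complexity-theoretic results (almost) rule out the existence of efficient algorithms for this task unless the input matrices are forced to have favorable structures. In particular, we prove the following: (1) Computing $\sum_S \det({\bf A}_{S,S})^p$ exactly for every (fixed) positive even integer $p$ is UP-hard and Mod$_3$P-hard, which gives a negative answer to an open question posed by Kulesza and Taskar. (2) $\sum_S \det({\bf A}_{S,S}) \det({\bf B}_{S,S}) \det({\bf C}_{S,S})$ is NP-hard to approximate within a factor of $2^{O(|I|^{1-\epsilon})}$ or $2^{O(n^{1/\epsilon})}$ for any $\epsilon > 0$, where $|I|$ is the input size and $n$ is the order of the input matrix. This result is stronger than the #P-hardness for the case of two matrices derived by Gillenwater. (3) There exists a $k^{O(k)} n^{O(1)}$-time algorithm for computing $\sum_S \det({\bf A}_{S,S}) \det({\bf B}_{S,S})$, where $k$ is the maximum rank of $\bf A$ and $\bf B$, or the treewidth of the graph formed by the nonzero entries of $\bf A$ and $\bf B$. Such parameterized algorithms are said to be fixed-parameter tractable. These results can be extended to the fixed-size case. Further, we present two applications of fixed-parameter tractable algorithms given a matrix $\bf A$ of treewidth $w$: (4) We can compute a $2^{\frac{n}{2p-1}}$-approximation to $\sum_S \det({\bf A}_{S,S})^p$ for any fractional number $p > 1$ in $w^{O(wp)} n^{O(1)}$ time. (5) We can, in $w^{O(w \sqrt{n})} n^{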
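The normalizing constants discussed in this abstract can be made concrete on a tiny instance. The sketch below is a brute-force enumeration over all subsets, not the fixed-parameter algorithm from the paper; the function names and the example matrices are illustrative assumptions. It also checks the standard identity $\sum_S \det({\bf A}_{S,S}) = \det({\bf I} + {\bf A})$, which is what makes the single-matrix case easy.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    n = len(m)
    if n == 0:
        return 1  # empty principal minor, i.e. the S = {} term
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def principal_submatrix(m, subset):
    """Rows and columns of m indexed by subset."""
    return [[m[i][j] for j in subset] for i in subset]


def product_dpp_normalizer(matrices, p=1):
    """Brute-force sum over all S of prod_m det(m_{S,S})^p (exponential in n)."""
    n = len(matrices[0])
    total = 0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            term = 1
            for m in matrices:
                term *= det(principal_submatrix(m, subset)) ** p
            total += term
    return total


# Hypothetical 3x3 integer matrices, chosen only for illustration.
A = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]
B = [[1, 0, 1], [0, 1, 0], [1, 0, 2]]
identity_plus_A = [[A[i][j] + (1 if i == j else 0) for j in range(3)] for i in range(3)]

z_single = product_dpp_normalizer([A])        # one matrix: equals det(I + A)
z_pair = product_dpp_normalizer([A, B])       # two matrices: case (3) of the abstract
z_squared = product_dpp_normalizer([A], p=2)  # exponent p = 2: case (1) of the abstract
```

The enumeration visits all $2^n$ subsets, so it is only useful as a reference implementation for checking faster methods on small inputs.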
Classification bandits are multi-armed bandit problems in which the task is to classify a given set of arms as either positive or negative, depending on whether the rate of arms with an expected reward of at least h is no less than w, for given thresholds h and w. We study a special classification bandit problem in which arms correspond to points x in d-dimensional real space, with expected rewards f(x) generated according to a Gaussian process prior. We develop a framework algorithm for the problem that supports various arm selection policies, and we propose two policies called FCB and FTSV. We show a smaller sample complexity upper bound for FCB than for the existing level set estimation algorithm, which must decide for every arm's x whether f(x) is at least h. We also propose arm selection policies that depend on an estimated rate of arms with rewards of at least h, and show that they improve empirical sample complexity. According to our experimental results, the rate-estimation versions of FCB and FTSV, together with that of the popular active learning policy that selects the point with the maximum variance, outperform other policies on synthetic functions, and the rate-estimation version of FTSV is also the best performer on our real-world dataset.
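The classification criterion defined at the start of this abstract can be written down directly. The following is a minimal sketch that assumes the expected rewards are already known, which is exactly what a bandit algorithm does not have and must estimate from samples; the function name and example values are hypothetical.

```python
def classify_arm_set(expected_rewards, h, w):
    """Return the class label and the rate of arms whose expected reward is at least h."""
    if not expected_rewards:
        raise ValueError("at least one arm is required")
    good_arms = sum(1 for reward in expected_rewards if reward >= h)
    rate = good_arms / len(expected_rewards)
    label = "positive" if rate >= w else "negative"
    return label, rate


# Hypothetical expected rewards for four arms.
rewards = [0.2, 0.7, 0.9, 0.4]
label_half, rate_half = classify_arm_set(rewards, h=0.5, w=0.5)
label_strict, rate_strict = classify_arm_set(rewards, h=0.5, w=0.6)
```

Note that the label depends only on the rate, so an algorithm can stop as soon as the rate is known to be above or below w, without resolving every arm individually.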
The long-standing theory that a colour-naming system evolves under the dual pressure of efficient communication and perceptual mechanism is supported by a growing number of linguistic studies, including an analysis of four decades of diachronic data from the Nafaanra language. This inspires us to explore whether artificial intelligence could evolve and discover a similar colour-naming system by optimising the communication efficiency represented by high-level recognition performance. Here, we propose a novel colour quantisation transformer, CQFormer, that quantises colour space while maintaining the accuracy of machine recognition on the quantised images. Given an RGB image, the Annotation Branch maps it into an index map before generating the quantised image with a colour palette, while the Palette Branch uses a key-point detection approach to find suitable palette colours within the whole colour space. By interacting with colour annotation, CQFormer is able to balance machine vision accuracy against colour perceptual structure, such as a distinct and stable colour distribution for the discovered colour system. Interestingly, we observe a consistent evolution pattern between our artificial colour system and basic colour terms across human languages. In addition, our colour quantisation method effectively compresses image storage while maintaining high performance in high-level recognition tasks such as classification and detection. Extensive experiments demonstrate the superior performance of our method with extremely low bit-rate colours. We will release the source code soon.
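The relationship between an index map, a palette, and the quantised image can be sketched as follows. This is a plain nearest-colour assignment used only for illustration; it is not the learned Annotation Branch or Palette Branch of CQFormer, and the function names, palette, and image are assumptions.

```python
def nearest_index(pixel, palette):
    """Index of the palette colour closest to pixel in squared RGB distance."""
    return min(
        range(len(palette)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(pixel, palette[i])),
    )


def quantise(image, palette):
    """Map each pixel to a palette index, then rebuild the image from the palette."""
    index_map = [[nearest_index(pixel, palette) for pixel in row] for row in image]
    quantised = [[palette[i] for i in row] for row in index_map]
    return index_map, quantised


# Hypothetical 1-bit palette and 2x2 RGB image.
palette = [(0, 0, 0), (255, 255, 255)]
image = [
    [(10, 20, 30), (250, 240, 230)],
    [(200, 200, 200), (5, 5, 5)],
]
index_map, quantised = quantise(image, palette)
```

Storing the index map plus the palette needs one bit per pixel here instead of 24, which is the storage saving the abstract refers to.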
While natural systems often exhibit collective intelligence that allows them to self-organize and adapt to changes, an equivalent is missing in most artificial systems. We explore the possibility of such a system in the context of cooperative object manipulation using mobile robots. Although conventional works demonstrate potential solutions to the problem in restricted settings, they suffer from computational and learning difficulties. More importantly, these systems cannot adapt when facing environmental changes. In this work, we show that by distilling a planner derived from a gradient-based soft-body physics simulator into an attention-based neural network, our multi-robot manipulation system can achieve better performance than baselines. In addition, our system generalizes to configurations unseen during training and is able to adapt toward task completion when external turbulence and environmental changes are applied.
Continual learning (CL) learns a sequence of tasks incrementally. There are two popular CL settings: class incremental learning (CIL) and task incremental learning (TIL). A major challenge of CL is catastrophic forgetting (CF). While a number of techniques are already available to effectively overcome CF for TIL, CIL remains highly challenging. So far, little theoretical work has been done to provide principled guidance on how to solve the CIL problem. This paper performs such a study. It first shows that, probabilistically, the CIL problem can be decomposed into two sub-problems: Within-task Prediction (WP) and Task-id Prediction (TP). It further proves that TP is correlated with out-of-distribution (OOD) detection, which connects CIL and OOD detection. The key conclusion of this study is that, regardless of whether WP and TP or OOD detection are defined explicitly or implicitly by a CIL algorithm, good WP and good TP or OOD detection are necessary and sufficient for good CIL performance. Additionally, TIL is simply WP. Based on the theoretical result, new CIL methods are also designed, which outperform strong baselines in both CIL and TIL settings by a large margin.
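The probabilistic decomposition described here, where the CIL probability of a class is the product of a within-task probability (WP) and a task-id probability (TP), can be illustrated with a small sketch. The function name and the numbers are hypothetical, and this shows only the product rule, not the paper's proofs or methods.

```python
def cil_probabilities(within_task_probs, task_probs):
    """Combine WP and TP into class-incremental probabilities.

    within_task_probs[k][j] is P(class j | task k, x) and task_probs[k] is
    P(task k | x); the result[k][j] is their product, P(class j of task k | x).
    """
    return [
        [task_prob * class_prob for class_prob in class_probs]
        for class_probs, task_prob in zip(within_task_probs, task_probs)
    ]


# Hypothetical outputs for two tasks with two classes each.
within_task = [[0.9, 0.1], [0.3, 0.7]]
task_id = [0.2, 0.8]
joint = cil_probabilities(within_task, task_id)

# Predicted (task, class) pair with the highest combined probability.
best = max(
    ((k, j) for k in range(len(joint)) for j in range(len(joint[k]))),
    key=lambda kj: joint[kj[0]][kj[1]],
)
```

In this example the first task's head is very confident (0.9), yet the prediction goes to the second task because TP assigns it most of the mass, which is why both WP and TP have to be good.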
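3D point clouds can flexibly represent continuous surfaces and can be used for various applications; however, the lack of structural information makes point cloud recognition challenging. Recent edge-aware methods mainly use edge information as an extra feature that describes local structures to facilitate learning. Although these methods show that incorporating edges into network designs is beneficial, they generally lack interpretability, leaving users wondering how exactly edges help. To shed light on this issue, in this study we propose the Diffusion Unit (DU), which handles edges in an interpretable manner while providing decent improvement. Our method is interpretable in three ways. First, we theoretically show that DU learns to perform task-beneficial edge enhancement and suppression. Second, we experimentally observe and verify the edge enhancement and suppression behavior. Third, we empirically demonstrate that this behavior contributes to performance improvement. Extensive experiments on challenging benchmarks verify the superiority of DU in terms of both interpretability and performance gain. Specifically, our method achieves state-of-the-art performance in object part segmentation using ShapeNet part and in scene segmentation using S3DIS. Our source code will be released at https://github.com/martianxiu/diffusionunit.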
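We present a subword regularization method for WordPiece, which uses a maximum matching algorithm for tokenization. The proposed method, MaxMatch-Dropout, randomly drops words in a search using the maximum matching algorithm. It enables fine-tuning with subword regularization for popular pretrained language models such as BERT-base. The experimental results demonstrate that MaxMatch-Dropout improves the performance of text classification and machine translation tasks as well as other subword regularization methods do. Moreover, we provide a comparative analysis of subword regularization methods: subword regularization with SentencePiece (Unigram), BPE-Dropout, and MaxMatch-Dropout.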
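The idea of dropping candidates during maximum matching can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the authors' implementation: the vocabulary is a toy set, the `##` continuation prefix follows the WordPiece convention, and never dropping single-character pieces is an assumption made here so that tokenization can always finish.

```python
import random


def maxmatch_tokenize(word, vocab, dropout=0.0, rng=None):
    """Greedy longest-match tokenization that skips matches with probability dropout."""
    rng = rng or random.Random()
    tokens, start = [], 0
    while start < len(word):
        match = None
        # Try the longest candidate first, then progressively shorter ones.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # WordPiece marker for word-internal pieces
            if piece not in vocab:
                continue
            # Assumption: single characters are never dropped.
            is_single_char = end - start == 1
            if not is_single_char and rng.random() < dropout:
                continue
            match = piece
            break
        if match is None:
            return ["[UNK]"]  # no vocabulary entry covers this position
        tokens.append(match)
        start = end
    return tokens


# Hypothetical toy vocabulary.
vocab = {"un", "##aff", "##able", "u", "##n", "##a", "##f", "##b", "##l", "##e"}

deterministic = maxmatch_tokenize("unaffable", vocab, dropout=0.0)
fully_dropped = maxmatch_tokenize("unaffable", vocab, dropout=1.0)
sampled = maxmatch_tokenize("unaffable", vocab, dropout=0.5, rng=random.Random(0))
```

With `dropout=0.0` the function reduces to ordinary maximum matching, and with intermediate values the same word yields different segmentations across calls, which is the source of the regularization effect.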
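Connectionist temporal classification (CTC)-based models are attractive in automatic speech recognition (ASR) because of their non-autoregressive nature. To take advantage of text-only data, language model (LM) integration approaches such as rescoring and shallow fusion have been widely used for CTC. However, because they slow down inference, they lose CTC's non-autoregressive nature. In this study, we propose an error correction method with a phone-conditioned masked LM (PC-MLM). In the proposed method, less confident word tokens in the greedy decoded output from CTC are masked. PC-MLM then predicts these masked word tokens given the unmasked words and phones supplementally predicted from CTC. We further extend it to Deletable PC-MLM in order to address insertion errors. Since both CTC and PC-MLM are non-autoregressive models, the method enables fast LM integration. Experimental evaluations on the Corpus of Spontaneous Japanese (CSJ) and TED-LIUM2 in a domain adaptation setting show that our proposed method outperforms rescoring and shallow fusion in terms of inference speed, and also in terms of recognition accuracy on CSJ.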
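The masking step described in this abstract, where less confident word tokens in the greedy CTC output are replaced before the masked LM fills them in, could look roughly like this. The confidence threshold, the `[MASK]` symbol, and the example values are assumptions for illustration; the abstract does not specify how confidence is computed or thresholded.

```python
def mask_low_confidence(tokens, confidences, threshold=0.9, mask="[MASK]"):
    """Replace tokens whose confidence is below threshold with the mask symbol."""
    if len(tokens) != len(confidences):
        raise ValueError("tokens and confidences must have the same length")
    return [
        token if confidence >= threshold else mask
        for token, confidence in zip(tokens, confidences)
    ]


# Hypothetical greedy CTC output with per-token confidences.
hypothesis = ["the", "whether", "is", "nice"]
confidences = [0.99, 0.41, 0.95, 0.97]
masked = mask_low_confidence(hypothesis, confidences)
```

Only the masked positions are re-predicted, and all of them can be filled in a single forward pass, which keeps the correction step non-autoregressive.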
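Connectionist temporal classification (CTC)-based models are attractive because of their fast inference in automatic speech recognition (ASR). Language model (LM) integration approaches such as shallow fusion and rescoring can improve the recognition accuracy of CTC-based ASR by taking advantage of the knowledge in text corpora. However, they significantly slow down the inference of CTC. In this study, we propose to distill the knowledge of BERT for CTC-based ASR, extending our previous study for attention-based ASR. CTC-based ASR learns the knowledge of BERT during training and does not use BERT during testing, which maintains the fast inference of CTC. Unlike attention-based models, CTC-based models make frame-level predictions, so they need to be aligned with the token-level predictions of BERT for distillation. We propose to obtain alignments by calculating the most plausible CTC paths. Experimental evaluations on the Corpus of Spontaneous Japanese (CSJ) and TED-LIUM2 show that our method improves the performance of CTC-based ASR without any cost in inference speed.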
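The mismatch between frame-level CTC predictions and token-level BERT predictions can be seen by collapsing a CTC path. The sketch below assumes a path is already given (how the most plausible path is computed is not shown) and applies the standard CTC collapse rule: merge repeated labels, then remove blanks. The blank symbol and function name are illustrative choices.

```python
BLANK = "_"


def collapse_ctc_path(path, blank=BLANK):
    """Collapse a frame-level CTC path into tokens and record each token's frames."""
    tokens, frames = [], []
    previous = blank
    for t, label in enumerate(path):
        if label != blank:
            if label != previous:
                # A new token starts at frame t.
                tokens.append(label)
                frames.append([t])
            else:
                # The same label repeated without a blank extends the current token.
                frames[-1].append(t)
        previous = label
    return tokens, frames


# Hypothetical 9-frame path for the token sequence h, e, l, l, o.
path = ["_", "h", "h", "_", "e", "l", "_", "l", "o"]
tokens, frames = collapse_ctc_path(path)
```

The `frames` list gives, for each output token, the frames that emitted it, which is the kind of frame-to-token alignment needed to pair a token-level teacher signal with frame-level outputs.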
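Existing video domain adaptation (DA) methods need to store all temporal combinations of video frames or pair the source and target videos, which is expensive in memory and cannot scale to long videos. To address these limitations, we propose the following memory-efficient graph-based video DA approach. First, our method models each source or target video as a graph: nodes represent video frames, and edges represent temporal or visual similarity relationships between frames. We use a graph attention network to learn the weights of individual frames and simultaneously align the source and target videos in a domain-invariant graph feature space. Instead of storing a large number of sub-videos, our method constructs only one graph, with a graph attention mechanism, per video, which reduces the memory cost substantially. Extensive experiments show that, compared with state-of-the-art methods, we achieve superior performance while significantly reducing the memory cost.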
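The per-video graph described above, with frames as nodes and temporal or visual-similarity relations as edges, can be sketched as follows. The cosine similarity measure, the threshold, and the toy feature vectors are assumptions for illustration; the abstract does not state how similarity edges are chosen, and the graph attention network itself is not shown.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two feature vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_video_graph(frame_features, similarity_threshold=0.9):
    """Return undirected edges (i, j) with i < j over frame indices."""
    n = len(frame_features)
    edges = set()
    # Temporal edges connect consecutive frames.
    for i in range(n - 1):
        edges.add((i, i + 1))
    # Visual-similarity edges connect non-adjacent frames with similar features.
    for i in range(n):
        for j in range(i + 2, n):
            if cosine_similarity(frame_features[i], frame_features[j]) >= similarity_threshold:
                edges.add((i, j))
    return edges


# Hypothetical 2-D features for a four-frame video.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1], [0.0, 1.0]]
edges = build_video_graph(features)
```

One such graph is built per video, so memory grows with the number of frames and edges rather than with the number of frame combinations.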